1 TTCGCCTGCG GGCCGGCACT GCTCACCTCT CGTCCAGGGA CATGACGGGC 

51 ACGCCAGGCG CCGTTGCCAC CCGGGATGGC GAGGCCCCCG AGCGCTCCCC 

101 GCCCTGCAGT CCGAGCTACG ACCTCACGGG CAAGGTGATG CTTCTGGGAG 

151 ACACAGGCGT CGGCAAAACA TGTTTCCTGA TCCAATTCAA AGACGGGGCC 

201 TTCCTGTCCG GAACCTTCAT AGCCACCGTC GGCATAGACT TCAGGAACAA 

251 GGTGGTGACT GTGGATGGCG TGAGAGTGAA GCTGCAGATC TGGGACACCG 

301 CTGGGCAGGA ACGGTTCCGA AGCGTCACCC ATGCTTATTA CAGAGATGCT 

351 CAGGCCTTGC TTCTGCTGTA TGACATCACC AACAAATCTT CTTTCGACAA 

401 CATCAGGGCC TGGCTCACTG AGATTCATGA GTATGCCCAG AGGGACGTGG 

451 TGATCATGCT GCTAGGCAAC AAGGCGGATA TGAGCAGCGA AAGAGTGATC 

501 CGTTCCGAAG ACGGAGAGAC CTTGGCCAGG GAGTACGGTG TTCCCTTCCT 

551 GGAGACCAGC GCCAAGACTG GCATGAATGT GGAGTTAGCC TTTCTGGCCA 

601 TCGCCAAGGA ACTGAAATAC CGGGCCGGGC ATCAGGCGGA TGAGCCCAGC 

651 TTCCAGATCC GAGACTATGT AGAGTCCCAG AAGAAGCGCT CCAGCTGCTG 

701 CTCCTTCATG TGAATCCCAG GGGGCAGAGA GGAGGCTCTG GAGGCACACA 

751 GGATGCAGCC TTCCCCCTCC CAGGCCTGGC TTATTCCAAG AGGCTGAGCC 

801 AATGGGGAGA AAGATGGAGG ACTCACTGCA CAGCCGCTTC CTAGCAGGGA 

851 GCTATACTCC AACTCCTACT TGAGTTCCTG CGGTCTCCCC GCATCCACAG 

901 GGAGGGTAAA ACACTTAGCT TTTATTTTAA TAGTACATAA TTTAATACCA 

951 AAAAAGGCGC CTGGATCCCC AAAAAACCGA GGCTGGGAGC TAGTGGCCCT 

1001 TTTGCTTTCT AGGACTTGGG GGGCCGGCCC TCCCTCCTAA GCATAACAAA 

1051 GGTGGTGTTG CTCCAGCTCA GCCCCAGGGG ACACAGATGC ACTTTGGGGG 

1101 TGAGGGCAGG TAATGACTCC ATCGCACCCT CAGTTCAGCT GGACAGAGGC 

1151 TCAGGTGACC CCAGCCTTCA CTGTCTCCCG CTCTCCAGGA GCTTATCTTC 

1201 GCCCCATCTC CCAAATAAGT GGGCCCTTGT GCTGTGAGGA AGACCAAAGC 

1251 CTCAGGGAAG ATAAGAGATA TGGAGATGGG AGGGGGAGGA CAAGGGGCAG 

1301 AGAGTAGGGT CTAGCTGGCT ATCTCTGGCC TTACTAACAC CCCCCTGGAG 

1351 GCATGCCCCT TTTCTCCAGC ACACAAGCAC ATTGGGGCAC CTGGAAATAT 

1401 TGGTTCCAGG CTCCTGTTCT CTGGACTTCA GATCCTGGGG GAGCCCCTCC 

1451 CCCCCCTGAA TCCCTGGCTT AGCTACCTTC CTGCCTGTGC ACCTAAAAAC 

1501 CTCAGGTCAG AACTAGGAAA AGAGTTTTGT TTTTATTTTT TTGAAATGGA 

1551 GTCTCGTTCT GTCGCCCAGG CTGAGGTGCA GTAGTGCAAT CTCCGCTCAC 

1601 TACAACCTCC ACTCCCTGGG GCTCAAGCGA TCCTCCCACC TCAGCCGCCG 

1651 AAGTAGCTGG GACTATAGGT GTGTACCATC ACACCTGGCT AATTTTTGTA 

1701 TTTTTTGTAG ACACAGGGTT TCGCCATGTT GCCCAGGCTG GTCTTGAATT 

1751 CCTGAGCTCA AGCAACCTGC CGGCCTCGGC CTCCCAAAGT ACTGGGATTA 

1801 CACGCAGAAG GCACCATGCC CAGGCTAGAT GTGTCTTATC CCAATCCTTT 

1851 GGCAGGCATG CAGCTCCACA GGCGATTTCT TCAAGCAGCT GAAGTGTTTA 

1901 GCCCTCCTGG GTTAAGAGCC AGATAAGGAG AAATCCCTTT CCTAGGTTTG 

1951 GAATGTGTTG TGAAAAAAAA GAGAAATCCC TGGCTCCTGG AGCTGGTGGG 

2001 AGACAAGATT AAGCAAACCT CCCCTGACAT GTATCCCTTT GACCCCAAGC 

2051 TCTGCCTCCT CCCTGACCAC CCATGCCCTT TCCTTTAACT TCTCAAACAG 

2101 ATACCAGGGC CTAAACTGCT TTACCTCCCC TCCTACTGAG TCAGGTTAGG 

2151 TGGTGGGAGG TCACCCATTT CCGAGTTAAA CCAATGCAAT ATGAGTAAAA 

2201 CAAAGTCATG TGGGTATGTC TGGGGTAGAG AGAGGGGTAG CAAGTTCATG 

2251 TGTCCTCCTT GGTCACATAT CTCCCAAAGC TCTGATCCCT GCCATGGGAA 

2301 GTGGACAGGA AACATGAGGT CATGACCTGC AGGCATCTTT ACTGCAGCTC 

2351 TGCCGGCCTG GAGGGGGAGA GGGGGAGGAA GAAGTATGCG CTGCACATTT 

2401 CTGAGGCTAC TGCATTTGCT TTCAAGGCAG AAATCTTGCT CTGAGCAGTC 

2451 AGCGGCTCCA GTTTGGGCCC GATAAGGAAG TTCTCCGTGG CCTCCCTCAG 

2501 GCAGAGCAGG GAGGAGGCTG ACATTGCCAG TCTCTTCTGG GGCCCAAGGC 

2551 AGGTTGCAGG AGATCCAATC CCATAGACAG CTCTGGGCCT CTTGCATTTG 

2601 AGTTTTTCAG AATTAAACTG CAGTATTTTG GAAAGCAAAA AAAAAAAAAA 

2651 AAAAAAAAAA AAAAAAAAAA AAAA (SEQ ID NO:1) 



FEATURES: 

5'UTR: 1-41 

Start Codon: 42 

Stop Codon: 711 

3'UTR: 714 
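
The feature coordinates listed above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the listed positions are 1-based and that "Stop Codon: 711" gives the first base of the stop codon; the function name cds_summary and the dictionary keys are illustrative and do not come from the listing.

```python
# Coordinates copied from the FEATURES block (assumed 1-based).
START_CODON = 42   # first base of the ATG
STOP_CODON = 711   # first base of the stop codon


def cds_summary(start_codon: int, stop_codon: int) -> dict:
    """Derive simple coding-region statistics from start/stop codon positions."""
    # Bases from the start codon up to, but not including, the stop codon.
    coding_nt = stop_codon - start_codon
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("start and stop codons are not in the same reading frame")
    return {
        "coding_nt": coding_nt,
        "residues": coding_nt // 3,
        "frame": (start_codon - 1) % 3 + 1,   # forward frame 1, 2 or 3
        "utr5_end": start_codon - 1,
        "utr3_start": stop_codon + 3,
    }


summary = cds_summary(START_CODON, STOP_CODON)
print(summary)
```

Under these assumptions the sketch gives 223 residues, a 5'UTR ending at 41, a 3'UTR starting at 714 and forward frame 3, which agrees with the 223-residue protein (SEQ ID NO: 2) and the "Frame = +3" reported in the BLAST alignment.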



Homologous proteins: 

Top 10 BLAST Hits 
                                                                     Score  E 
CRA|103000001517087 /altid=gi|10946770 /def=ref|NP_067386.1| RA...     425  e-117 
CRA|1000682330460 /altid=gi|7657492 /def=ref|NP_055168.1| RAB26...     297  4e-79 
CRA|18000004977238 /altid=gi|1710022 /def=sp|P51156|RB26_RAT RA...     294  3e-78 
CRA|18000005013109 /altid=gi|1083775 /def=pir||JC2528 GTP-bindi...     293  7e-78 
CRA|89000000198627 /altid=gi|7296421 /def=gb|AAF51708.1| (AE003...     273  9e-72 
CRA|18000005076419 /altid=gi|7438397 /def=pir||T15123 hypotheti...     207  4e-52 
CRA|18000004912300 /altid=gi|134236 /def=sp|P20791|SAS2_DICDI G...     203  7e-51 
CRA|98000043536338 /altid=gi|12963499 /def=ref|NP_075615.1| cel...     203  9e-51 
CRA|18000004929618 /altid=gi|131798 /def=sp|P24407|RAB8_HUMAN R...     202  1e-50 
CRA|18000004952869 /altid=gi|131848 /def=sp|P22128|RAB8_DISOM R...     202  2e-50 
CRA|18000005221564 /altid=gi|4586580 /def=dbj|BAA76422.1| (AB02...     202  2e-50 



BLAST dbEST hits: 
                                              Score  E 
gi|13033710 /dataset=dbest /taxon=960...       1318  0.0 
gi|12785775 /dataset=dbest /taxon=960...       1316  0.0 
gi|12904236 /dataset=dbest /taxon=960...       1035  0.0 
gi|9093496 /dataset=dbest /taxon=9606...        694  0.0 






EXPRESSION INFORMATION FOR MODULATORY USE: 

Library source: 
From BLAST dbEST hits: 
gi|13033710 prostate 
gi|12785775 brain 
gi|12904236 T cells from T cell leukemia 
gi|9093496 leukopheresis 



From tissue screening panels: 
leukocyte 
1 MTGTPGAVAT RDGEAPERSP PCSPSYDLTG KVMLLGDTGV GKTCFLIQFK 

51 DGAFLSGTFI ATVGIDFRNK VVTVDGVRVK LQIWDTAGQE RFRSVTHAYY 

101 RDAQALLLLY DITNKSSFDN IRAWLTEIHE YAQRDVVIML LGNKADMSSE 

151 RVIRSEDGET LAREYGVPFL ETSAKTGMNV ELAFLAIAKE LKYRAGHQAD 
201 EPSFQIRDYV ESQKKRSSCC SFM (SEQ ID NO: 2) 



FEATURES: 

Functional domains and key regions: 

[1] PDOC00001 PS00001 ASN_GLYCOSYLATION 
N-glycosylation site 

114-117 NKSS 

[2] PDOC00004 PS00004 CAMP_PHOSPHO_SITE 

cAMP- and cGMP-dependent protein kinase phosphorylation site 

Number of matches: 2 

1 214-217 KKRS 

2 215-218 KRSS 

[3] PDOC00005 PS00005 PKC_PHOSPHO_SITE 
Protein kinase C phosphorylation site 

Number of matches: 5 

1 29-31 TGK 

2 113-115 TNK 

3 149-151 SER 

4 173-175 SAK 

5 212-214 SQK 

[4] PDOC00006 PS00006 CK2_PHOSPHO_SITE 
Casein kinase II phosphorylation site 

116-119 SSFD 

[5] PDOC00008 PS00008 MYRISTYL 
N-myristoylation site 

Number of matches: 5 

1 3-8 GTPGAV 

2 6-11 GAVATR 

3 39-44 GVGKTC 

4 52-57 GAFLSG 

5 57-62 GTFIAT 

[6] PDOC00017 PS00017 ATP_GTP_A 
ATP/GTP-binding site motif A (P-loop) 

36-43 GDTGVGKT 

[7] PDOC00579 PS00675 SIGMA54_INTERACT_1 

Sigma-54 interaction domain ATP-binding region A signature 
32-45 VMLLGDTGVGKTCF 
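
Two of the motif hits above can be reproduced with a simple pattern scan. This is a minimal sketch, assuming the standard PROSITE patterns for PS00017 ([AG]-x(4)-G-K-[ST]) and PS00001 (N-{P}-[ST]-{P}) translated into regular expressions; the helper name scan and the PATTERNS dictionary are illustrative, and the protein string is SEQ ID NO: 2 as printed in this listing.

```python
import re

# SEQ ID NO: 2 as listed above (223 residues), 50 residues per row.
PROTEIN = (
    "MTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMLLGDTGVGKTCFLIQFK"
    "DGAFLSGTFIATVGIDFRNKVVTVDGVRVKLQIWDTAGQERFRSVTHAYY"
    "RDAQALLLLYDITNKSSFDNIRAWLTEIHEYAQRDVVIMLLGNKADMSSE"
    "RVIRSEDGETLAREYGVPFLETSAKTGMNVELAFLAIAKELKYRAGHQAD"
    "EPSFQIRDYVESQKKRSSCCSFM"
)

# PROSITE patterns rewritten as regular expressions (assumed translation).
PATTERNS = {
    "ATP_GTP_A": r"[AG].{4}GK[ST]",        # PS00017, P-loop
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",  # PS00001
}


def scan(sequence: str, regex: str) -> list:
    """Return (start, end, match) tuples, 1-based inclusive, overlaps allowed."""
    # A lookahead with a capture group reports overlapping matches as well.
    lookahead = re.compile(f"(?=({regex}))")
    return [
        (m.start() + 1, m.start() + len(m.group(1)), m.group(1))
        for m in lookahead.finditer(sequence)
    ]


for name, regex in PATTERNS.items():
    print(name, scan(PROTEIN, regex))
```

The scan reports 1-based inclusive coordinates so that the output can be compared directly with the ranges listed for entries [1] and [6].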



Membrane spanning structure and domains: 

Helix Begin End Score Certainty 
1 48 68 0.715 Putative 
BLAST Alignment to Top Hit: 

>CRA|103000001517087 /altid=gi|10946770 /def=ref|NP_067386.1| RAB37, 
member of RAS oncogene family; GTPase Rab37 [Mus 
musculus] /org=Mus musculus /taxon=10090 /dataset=nraa 
/length=223 
Length = 223 



Score = 425 bits (1081), Expect = e-117 

Identities = 209/223 (93%), Positives = 215/223 (95%) 

Frame = +3 



Query: 42  MTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMLLGDTGVGKTCFLIQFKDGAFLSGTFI 221 
           MTGTPGA    DGEAPERSPP SP+YDLTGKVMLLGD+GVGKTCFLIQFKDGAFLSGTFI 
Sbjct: 1   MTGTPGAATAGDGEAPERSPPFSPNYDLTGKVMLLGDSGVGKTCFLIQFKDGAFLSGTFI 60 

Query: 222 ATVGIDFRNKVVTVDGVRVKLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITNKSSFDN 401 
           ATVGIDFRNKVVTVDG RVKLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITN+SSFDN 
Sbjct: 61  ATVGIDFRNKVVTVDGARVKLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITNQSSFDN 120 

Query: 402 IRAWLTEIHEYAQRDVVIMLLGNKADMSSERVIRSEDGETLAREYGVPFLETSAKTGMNV 581 
           IRAWLTEIHEYAQRDVVIMLLGNKAD+SSERVIRSEDGETLAREYGVPF+ETSAKTGMNV 
Sbjct: 121 IRAWLTEIHEYAQRDVVIMLLGNKADVSSERVIRSEDGETLAREYGVPFMETSAKTGMNV 180 

Query: 582 ELAFLAIAKELKYRAGHQADEPSFQIRDYVESQKKRSSCCSFM 710 
           ELAFLAIAKELKYRAG Q DEPSFQIRDYVESQKKRSSCCSF+ 
Sbjct: 181 ELAFLAIAKELKYRAGRQPDEPSFQIRDYVESQKKRSSCCSFV 223 (SEQ ID NO: 4) 



Hmmer search results (Pfam): 

Model    Description                             Score  E-value  N 
PF00071  Ras family                              306.9  8.4e-90  1 
CE00060  CE00060 rab_ras_like                    213.3  3.7e-60  1 
PF01142  Uncharacterized protein family UPF0024    2.6  3.4      1 

Parsed for domains: 

Model    Domain  seq-f seq-t     hmm-f hmm-t     score  E-value 
CE00060  1/1        31   191 ..     25   193 ..  213.3  3.7e-60 
PF01142  1/1       185   201 ..    444   462 .]    2.6  3.4 
PF00071  1/1        31   223 .]      1   198 []  306.9  8.4e-90 
1 AGGGGAGAGA AAAGACCGCA TACCAGGCCA GGTGCGGTGG CTCACGCTTG 
51 TAATCCCAGC AATTTGGAAG GCCAAGGCAG GCGTATCGCC TGAGGTCAGC 
101 AGTTCCAAAC CAGCCTGTCC AACATGGTGA AGTTCTCTAC TAAGT^^TACA 
151 AAAATTACCC AGGCGTGGTG GCGTGCACCT GTAGTCCCAG CTGCTCCAGA 
201 GGCTGAGGCA GGAGAATTGC TTGAACCTGG GAGGCAGAGG CTGCAATGCG 
251 CCAAGATCCC GCCACTGCAC TCCAGCCTGG GCGACAGAGT GAGACTCCGT 
301 CTCCGGGAGC CCACGGCATT GAGCAAACCT CGGCATTATT TGCAGCAAGA 
351 GCCTCTGGCA TCCAAATAGC AACCAACACC ACGCCTCTGT AGTGTGCTGC 
401 GCAGCCTCCA CACTCCAGTC TGAGGCTCCC TGTTTGAGTC CCGCCCTATG 
451 CCCAGCTGAG GTTATAGCAC GCTCACCTCC AGAAGAGGTA ACCCAAGCTC 
501 TTTACTCTAC TGGAGATCAC CTCTGTCCCC ACTCTGGGCG CTTCTCCCAG 
551 CTGACAGAAA ATACCTCCAG CTGATGTCAG AAAATACAGG GCTGGAGGCT 
601 GGCGTACAAA GTCAGTCCCC ACAGGCCTAT GGTGGCCCAT AAGCCACGTC 
651 TACCCCTGCT CCTCACCTCC ACACCTAAGT TAAGAATTGC AGGCCGGGCG 
701 CAGTGGCTCA CGCCTGTAAT CCCAGCACTT TGGGAGGCTG AGGTGGGCGG 
751 ACCGCCTGAG GTCAGGAATT TGAGACCAGC TTGGCCAACA TGGCAAAACC 
801 CCGTCTCTAC TAAAAATACA AAAAGAAAAA ATAGCCGGGC CTGATGTCGC 
851 GCACCTGTAA TCCCAGCTAC TCCGGGAGAC TGAGGCGGGA GTATAGCTTG 
901 AACCCGGGAA GCAAAGGTTG CAGTGAGGCG AGATCGCACC ACTGCACTCC 
951 AGGCTGGGCG ACAGAGTGAG ACTCTGTCTG AAAAAAAAAA AAAGTGCAGG 
1001 TACCCCTCTC CAGCTCTCCC CTCCCTACAC ATCCCTCAAA CCGTCCCGCT 
1051 GTAATGCACC GGGGCTGTTC CTTGGT-Z^J^.CT TGZ^J\GCTGC-T TA-TAGAATGT 
1101 GGAGATGGGG GTAATTGAAA GGTCGGCCCA GGCCACAGAG CCCCTGAGCT 
1151 CTGCTACCGG CAACCCCAGC TGCACTCCCC ACTCTCTGTC ACCAGGAGCT 
1201 GCCGGGTGCC TGGGATATCC TGGCAGCTCT GCTCAAAATG ATCTACGACT 
1251 TCATGAATTT ATTTGGCTCC TCCTCGGGGC CAGGGTGAGT GTCATGGGTT 
1301 AATAAGGCCG GCCCCGCCTT CAGGAGCGGT CCACTGGGAG ATGTGTGCTG 
1351 CGCAGCCCTC TTGCGAAAGC TCTCCCCTGG TGGGACATTC TGGGCACAAC 
1401 CAACAGGCCG GGGGAAATGA GAGGTGATCC ATACTAAAGG GTCAAAGTCC 
1451 CCGCACCAGG CAGAGGCCCC AAAACACCGC AGCGTACATG TGCTGCAAGG 
1501 CGAGTACGGG TTGGTAAACA AAACTATATT CAGATGAGCT CGGGCCGGGT 
1551 GACTTAACAG ATGAGGAAGT GTCTCGGGGC CATCGGCGGA GGCGCAGCCC 
1601 AGGGGTCCCC AGCTCCCCGC CTCGCCACCT GGGGACAGCC CACGGCCCGG 
1651 GGCTCGGGCG CCGCCTGCTG TCGCGGTGCG CAGCGACTAC GGGAACTCTT 
1701 CCGCAGCAGA CGGGGTCCCC GCGGCCCGCT CCCCCAGGGG CAAGCAAGCG 
1751 ACCACAGGGG ACCGGTCCCG GGGCTGGATG TGGCTCATGT CCGAAGCGCA 
1801 CGGAGCCGAG CCGGTGTTGC TCAGGGAGGC TGCCCGCCCC TTCACGCAGA 
1851 CCCTGCGGCT CTGCGTGCCC TCAGGGAACA GCAAGGTCCG AGCCGGTGTC 
1901 GTCGAGGGGG CGACGGGACG GAGGGAGGAG CCTGAGGGGT CCCGGTCGAG 
1951 GGAGGGGAGG AGTGGGCGGG GCGGGGGTGG GGGCCGTTCC CGCGCTCTCC 
2001 TTCGCCTGCG GGCCGGCACT GCTCACCTCT CGTCCAGGGA CATGACGGGC 
2051 ACGCCAGGCG CCGTTGCCAC CCGGGATGGC GAGGCCCCCG AGCGCTCCCC 
2101 GCCCTGCAGT CCGAGCTACG ACCTCACGGG CAAGGTGGGT GGGCCTCTTC 
2151 CGTGAGACCC CCGCCCTCCT CGGCGCTAGC CCCTTCCTGG CTGCGTCTGG 
2201 GTTGGACTCA GCCCTTCCCC CAGGCAGCTG CGTCTCCCAG AGGAGGGAGG 
2251 GAGAGAGGGT CAGGACACAG CCTCTGGGGC CGTCCCAAGC TCTAGGTGTC 
2301 TCTGCTGGCT TGGTGGGGGC GGGTCGCGGA AGATCGCAAA AACTGAGTGA 
2351 TCCCCCCGCC GGCCCCAACT CAGTTCTCTT CTGCCACACT CTGGCAAATA 
2401 TGAGCCCCCG GGAGCCCATG CTTCTTGGTG AGGGTTAAGC GCGCAACTCT 
2451 CGGGGCTCAG GCTGGGAAGG GCTGGGAGAT GGGGACCGAA CGGAGACTCG 
2501 GAGAGGACGT CCCCTGCTGG CAGAGGAACT GGCGTTAATG CCATTTTCCG 
2551 AGCTAAGCTC TTAGTTGAGA TCTGACATCC AGGTTTAAGG CCTGATGTCC 
2601 CCCAGCTGCT CCCCTCCCAT TCCACCCGCT GGAGGCACTG CCTCCCACCT 
2651 TCCTCCCTGC AGTCGGAAGC CGCTCCTCCC AGAAGGATGT TGCCAGCCGG 
2701 CCTGCAGGTC ACTTGGGAAT TTTTCGAACC TGAGAAAGAT TTCAGTGGTT 
2751 GGTCTTTCGC ATCCCGCACT TGAGAGAGCT CCAGGGCTGC TCTCTGGGGC 
2801 TTGCTCCCTC TACAGGGGTG TCCTGTATGG AAACAGGTAG GGACAGCAGT 
2851 GGACTGGTCT GTCGCCTTCC ATCTGTGTCC TTGGAGTGAG CGGGTACCAG 
2901 AAACTGAAAG AACTGCTGAG GGAGCCTAGA GCTTCCACTC TTCCTCTGCA 
2951 GGGTTGGGGA TGGAGTGAGG GCTGTCCTGG ATTCCGCTGC ATGGCCTTGA 
3001 AGGAGACCTG CCTCTCTCTG GGCCTCGGTT TCCTCCCCGA CACCAGGGCT 
3051 CACCCTTGCT GGGAGCCTCA GCCTCCACCC CAGTGTTTCG GGGGAAGCCA 
3101 CCCTGCAAGT CATCCGCCCA GAGCCGTTGA GATAGGCGTC CTGTGTGGGC 
3151 TTGTGGCAGG AAATGGGCCC CTGCACCCTC GGAGAGGAGG AGCTGCTGTT 

3201 GGCCAGGCCC CAGGCTGAGG GGGACTGCCT GACCTTGTTG CCCTGCAAAC 

3251 CAGCTGGGTT GTTTGCCTAG GAGGTGGCCA GGCTAGGCAG CTGTTTGTGT 

3301 TTGGTGGAAT CACCGAGCTG GGTGGGTAGC TGGCATCGTT TGCTCAAGGC 

3351 AGCTGTGATC TGTAAAGTAC ACAAAGACTG GCCCTCCCTC CCTCCTTCCT 

3401 GCTCCAGGGC TGGGACCCAG GAGCCAGGGA GGAGTGCAGG CTCCAGAAAG 

3451 CTCCTATCCC CCACCCCTTC ATCTGTTCCC TGGCCAAGCG GCATTGGCCG 

3501 GAGAGTTGGT CCCCAGCCTC CCCGGGCCTG CCCCAGGGGA GTGAGTCCAG 

3551 GACCCTCTGA GAAAGCCTGG CAGGAGCTCC TTGGACCAGA CTAGGGGTGA 

3601 TGTGGCCCAC AGGCAGACAG TTCCCACCCT GGGCCACTCT TCCCTGGGTC 

3651 TTAGGTGATT CACCACGATG ATGGGCCCTA GCCATTAACA GACTCTAGAA 

3701 ATACCTCAAA GACATTATCC CTCCTCCTTC TACCCACTAT GGAAACCATG 

3751 CCACAGAAAG GTTAAGGAAT CTTCCTAAAG TCACACAGTA GGCCATTTAC 

3801 AAATCAAGAC CCATCCTTCA TACCCCTTCT GCTCAGCCAC CCCTGCCTCT 

3851 CCACCAGAGT TAACTAATGC CAGTACCCCA TGCCCACAAC AGGAATGCCT 

3901 TTGGGCTCCA CTGTCAATTT CAGAGCCTCA AAAATAATTC AAACCTAGTC 

3951 CCTGCTTAAC CCATTAAGCC ACCTAACCAG CAGCTGGGAA ATTCCAGCAT 

4001 TGGATCTAGA CCCCTGTTAT CCAAGATTGG AGAACAGTGG GACAAAGTGC 

4051 TCCTCTCCAC CATTCCTGCG TGTCCCTGGG GAAGATGAGC AGAGCAGAGC 

4101 CAGACAGTAA AGGAGAGGGC CACGCCCCCT CCACAGGTTA CCTCCTTGGT 

4151 ACTCCTGCCC GCACTACCCA CAGCAACCCC GGGATGCCGA TCTGCAGCCA 

4201 CATGTCCCAT GTGGGAGGTT TCTGCTGAAA GAACTTCCAA CTACACATCT 

4251 CCCCACTTCA GTATAAATTT CAACCTTCCC TAATTCATGC AACCTTTTTT 

4301 TTTTTTTTTT TTTTTTGAGA CAGAGTGTCG CTCTGTCACC GAGGCTGGAG 

4351 TTCAGTGATG CAATCTCGGC TCACTGCAAC CTCTACCTCC TGGGTTCAAG 

4401 CTATTCTCCT GTCTCCGCCT CCCAAGTAAC TGGGACTACA GGCGTGTGCC 

4451 ACCACTCCTG GCTAGTTTTT TGTATTTTTA GTAGAGATGG GGTTTCACCT 

4501 TGTTGGTCAG GCTGGTCTCA AACTCCCAAC TCAGGTGATC CGTCCACTTG 

4551 GGCACCCAAA ATGNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

4601 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

4651 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

4701 NNNNNNNNNN NNNNNNNNNN TTCAAGTACC AGCCTGGCCA ACATGGTAGA 

4751 AACCCCGTCT CTACTAAAAA TAAAAAATTA GCCAGGCGAG GTGGTGCATG 

4801 CCTATAATCC CAGCTACTCA GGTAGGCTGA GGCAGGAGAA TCATTTAAAC 

4851 CTGGGAGGTG GAGGTTGTGG TGAGCCAAGA TCTCGCCATT GCACTCCAGC 

4901 CTGGGCAACA AGAGCAAAAC TCCGTCTCAA AAAAAAAAAG AAAGAAAGAA 

4951 AGAAAGAAAC TTCCAAATAA ATGTTGTGAC ACAAAAAAAA AAACCCAAAC 

5001 AATATTCATT ATAGAGTATG CAAATGACCA TGCCCCACCC CCAGCAGATT 

5051 CTGATAGACT CCCTTGGGTG GGAATCCTTG TCCAATATAT TGACACTTCC 

5101 CTTTCCTGTC AGTATAGCCC AGCCCATGCG TGTACTCACG AGCGGACGAT 

5151 GGATGACACA AGTACACAGA GGGACGGAAT CCCTGCATGG TGTGGCTATG 

5201 GGCAAATGTG GCCACTGTCT AGATTGTGCA AATGTGGTGG TTCTCTGGGG 

5251 CCACAGAGCA CACTTGGGGA CCTGTTCATG GTGAGGTCTC AACTCCGGCC 

5301 TCTAGGAACT TGAATGAGGA CAGGAGGGTC AGAGGGAGAG CCTAGGAGGC 

5351 TGAGCCAAGG AGCGTGGAGA GGAGAGACAG GGTGAAGGTG GCGGCTGGCT 

5401 TTCTGGAAGC AGGTGGCCTT TGGTGCGGTC AGCATTCGTG CCAGCCCCCT 

5451 CTTCTCTGAT CCTCTCCATG TGTCTCTCTC CTGGAATCCC AGAAGCTGCC 

5501 CCTGACTCCC CATTAACTGC CTCTGCCCCT ACCCCCTAGG TGATGCTTCT 

5551 GGGAGACACA GGCGTCGGCA AAACATGTTT CCTGATCCAA TTCAAAGACG 

5601 GGGCCTTCCT GTCCGGAACC TTCATAGCCA CCGTCGGCAT AGACTTCAGG 

5651 GTGAGGTGGC TGCAGGCACT TGCTTCCAGC AGAGAGCCAG GGCTGTGGCT 

5701 CAGGCATGGG GGGGTTGCCC CCACCTTGCT CACCCTGGCT CCCAGGGACT 

5751 CCCGAGGCTC ATGCCTGGAG GGCACACAAC CCGCTCCCCC AAGACCACAG 

5801 AGGTGGCCGG GTCAAAGGAG ACTGGGCAAG GTTGGCTCCT TGCCCAACTA 

5851 TAGGATGCAA AAAAATGAGA CTGAGTCTTC GATTCCAGCT CCATTCCTGG 

5901 GGGACTTCTC CCAAGCAGAG CAGCCGCAGG CACGGCATAA GCTGAATATC 

5951 TTGGCCCACA GAGCCCCTGC TCATTGCTCT CCTACCTGGG CCCCTTTGGA 

6001 AAGGCCTCAA AGGTCAATCA GTCTTTCTGG AGTTCCCAGA AAGCACAGCC 

6051 CTGCACTGGG TTTAAGAGCT GGGCTTGGGC CAGGCATGGT GGCTCTTGCC 

6101 TGTATTCCCA GCACTTTGGG AGGCCGAAGC GGTCAGATCA CAAGGTCAGG 

6151 AGTTTGAGAC CAGCCTGGCC AACATGGTGA AACCCCGTCT CTACTAAAAA 

6201 TACAAAAATT AGCCAGGTGT AGTGGCACGC TCCTGCAGTC CCAGCTACTC 

6251 GGGAGGCTGA GGCAGGAGAA TCGCTCAAAT CCGGGTGGTG GAGGTTGCAG 
6301 TGAGCTGAGA TCGCGCCACT GCACTCCAGC CTGGGCAACA AAGTGAGACT 
6351 GCGTCTCAGA AAAAAAAAAA AAAAAAGAGC TGGGCTGGCC ATGTTGGGAG 
6401 ACAGCAGCTC ACCAGGGACC CTCCCTCTCA CCTTGACGAC TCCATCTTAC 
6451 AAATCTGCAT CAGGGATGCT AGACGCTGCA CACCTGAAGT GTTCAATAGA 
6501 GAAAAGGTCT CACCCTGGCA GGTGGGGCTC TACAGCTTCA AGCAGGCAGA 
6551 AAGCGAACAC TTCCTTCACT AGAGAATTAG TGGGCAGCTA AAGAAAAGGT 
6601 GCTGCTGCAG ATGTAGCCTC AGGTCCCCAG GATGCAGGCA AACACCCCAT 
6651 CTCCAGGGGC TCGGTCACAG TCCCAAGGCT AGGCTCCAGG AGAGGGAGAC 
6701 CGAAGTGGGG AAAGGGCAGG GCCTCCAGCA GCAACCAGCC CTCCAGCCCT 
6751 GGGCTGCCTG ATCCCTGGAG AGAGCCAGGA TGTTTCTCAG GCTCCTCTTG 
6801 CCCTGCTGTT GTGAGAAGGC AGTTACAGTC CTCAGAAGGG ACGACTCCAC 
6851 AGTGGAGGTG TCTGGGTATG GGGTTCCTGC TGCCCTGATG GTATGATCTG 
6901 GCTGGAGACG GTTCTGGGGC TCACTGCACC CACTCTAGGC CTGGAGAGGG 
6951 AACAAGAGAG GACGTCTGCA GAGCTGAGGA GCCACATGAC TCCTGCCCTC 
7001 CCATCCTCTG CCTTTTTCTC TTTCAGAACA AGGTGGTGAC TGTGGATGGC 
7051 GTGAGAGTGA AGCTGCAGGT GAGACCAGAG GCTGGAGTTG GGGAGGGAGG 
7101 ATGGAGGACC TGCCCTTCCT TCTCACCCTG AACCACAGGA GGCCTGCAGC 
7151 CCTGCCCTCC GCCTGGGGCA ATTTCCTGTG GGGCCCACGG GAGGAAATGG 
7201 CTTTTGTTTA TTTGACATCT GCAGAAAAAG CAGTTCCCAG GCACCCTCTC 
7251 ATCTATGAAC AGCAGCTCCA AATGCCTTCA GACAAGCTTA GCCTCCATCC 
7301 ATCTCCTCCC CAGTTGCCAG GGCTTTATCT GCTCTTAGGA GATTGGACAT 
7351 CCCCAACCCC TGAGCTAGGG GAGAGGAGAA GATTCTTTTT TTTTCTTTTC 
7401 TTTTCTTTTT TTTTTTGAGA TGGAGTCTCG CTCTGTCGCC CAGGCTGGAG 
7451 TGCAGTGGCA CAATCTCGGC TCACTGCAAC CTCTGCCTCC CAGGTTTAAG 
7501 AGATTCTCCT GCCTCAGCCT CCTGAGTAGC TGAGACTACA GGTGCATGCC 
7551 ACCACACCTG GCTAATTTTT TGTATTTTTA GTAGAGACGG GGTTTCACTG 
7601 TGTTAGCCAG GATGGTCTGG ATCTCCTGAC CTCGTGATCC GCCTGCCTCG 
7651 GCCTCCCAAA GTGCTGGGAT TACAGGTGTA AGCCACCGCG CTCGGCTGAG 
7701 GAGATGATTT TGAACGAGCT TGAGAAATCA GTAACTGCTA CTGTCCAGGT 
7751 CATTGGATGC TCAGGGGCTC ATGAGAACCT AAAGAAGAAA ACAGCCCCAC 
7801 CTTCCCACAG ATATCTCATA CAACAAAGCA GGCCTGCTCC ACCCAGCACA 
7851 TTCCTTGCAC CTGCCTCCTT CTGACCATTT CTCCATCCCA TCCCTTCCCA 
7901 GATCTGGGAC ACCGCTGGGC AGGAACGGTT CCGAAGCGTC ACCCATGCTT 
7951 ATTACAGAGA TGCTCAGGGT GAGTCCCTCG CACCCTCCAA CCCCTACCCC 
8001 AGCCCCTTGG TAGCATCCGT GCTGCTGCCT AAGTCCCCTC TGTGATCCTC 
8051 TCCCCTCCAG CCTTGCTTCT GCTGTATGAC ATCACCAACA AATCTTCTTT 
8101 CGACAACATC AGGGTAGGTC CTCCCTTCCC CTGACTCCCA CCCATAAGCA 
8151 GCCAAGGCAA GGTCTATGCA GGCTGGGGTT GCTTCCTGCC CTGTGGAAAG 
8201 CGGGTGGAGC GTGGAGTCCT CCTGCCTTCT GAAAAACACC TACTTGTGAC 
8251 TCAGAAGTCA TATCTGCTGC TTTGTATTTG GTGGCCATGT GGGCATGAAG 
8301 GCCAAGCAGG CTGTTGTGAC CCTGTGCCAC CTGCATAGCC CTCACTGTGA 
8351 TTCACGAGTG TGTTTCGTGA CAAAGTGTTC AGAACAGCCC CCACTCCACC 
8401 CTGGATAATT ATCCACAGAG ACCAAGGGAA AAACACAACC AGAAAAGTCC 
8451 ACACATACAT CCAGGGCAAG TTGCAAGAAA GTGACTCAGT CAGACAGAGT 
8501 GAGTGGTTGT ATCCTCACAA CCAAACTATT ATAGAGACAA AAATTTGATA 
8551 AATTCAAGCA CCAATTTTGT TCACGACATT GTATAGGTTT CATGAATCCC 
8601 CTGACCTCAA GGACAGTTTG CTGATAAGCA AACTAGGAGA ATAAAACGTT 
8651 TATATAGAAA GAGGAAAATC CATGGCACTC ATACTCCTAC CTCCAACCCC 
8701 ATGCTCATGG CAGACATCAC TAATCAATCA CAGTACTTTT GATCACTGAA 
8751 ACCCTTATGT GGTCTTAGAA TCTTTAACAG GACACTCCAA GAAATCACTG 
8801 CTGACAGCCA ACTGATTTGT GAGATAAGGT CTCCATGCAT CTGGATCTTC 
8851 CATAGAACTG ATAGTTGCAC AGCATAAAAT GGTGAGGGTG GGGCCATTGT 
8901 GGGTTGAGCC ACCAAGGAAG GCCATCCAGG CCTGGATGGG CCAGAACAAA 
8951 GGTACAGATG AGAGAACGCA CAGGGTATCG TGTTCAAGGT AGTGAGTAAC 
9001 TGAGGATAGT CAAACGGAGC AGAAGAAGAA AGGGGCAGCA GGAGGAAGAG 
9051 AATGCCAGTC TCGCACGCCC TCTCCCACAG GCCTGGCTCA CTGAGATTCA 
9101 TGAGTATGCC CAGAGGGACG TGGTGATCAT GCTGCTAGGC AACAAGGTGA 
9151 GTGGCTCCGG GGCAGGGTCA GCCCAGCCCT GCACTTCCTC AGCCCTAGCC 
9201 GGCCCCATAA CCACCCAAGA ACAGTTATCT AGGCATCCTT CCTGAAAAGG 
9251 ACTCTGCAGC CTCCAGCTCA GGGGTCAGAC ATATCTGGAG GCTTCTGCCC 
9301 ATCCCATCTG CCCCTTCCAG GGAAAGTCCA AGTTGTTGCC TGAGAAATCA 
9351 AGGGGTGCCC AGTTCTCAGC CCCCATTAGA GCAGAGTGAA CAGGGTCCCA 
9401 GGTCAGGGGC TAAGAGTGCA AAGGGTTAGC CCCAACTGCT GTCCTATTCC 
9451 AAGACCCTTT ACCAAAGGTG AGATCCCAGA GCTGGGAGCT ACACTGGGCA 

9501 GAAACCCTGG CCCCAGGCCA ATCACACCTG CCTGCAGTCC CTTGGGCCAC 

9551 CAGCAGAGGG CAGGCAACGC CTGCTTCTGG GGCAAAATAT GGGCCCGCTG 

9601 GGGCGGAGGC CTCCTTCCCC AGAGTGACCC ATTTGGGCTT GACAGGCGGA 

9651 TATGAGCAGC GAAAGAGTGA TCCGTTCCGA AGACGGAGAG ACCTTGGCCA 

9701 GGGTAAGTGA TTGTCTGTGG GACAGGGTGA AGGGTGGGGG CAACCCGACG 

9751 CTGGCCCTGA GGACACTCTC TCCCGGGCAG GAGTACGGTG TTCCCTTCCT 

9801 GGAGACCAGC GCCAAGACTG GCATGAATGT GGAGTTAGCC TTTCTGGCCA 

9851 TCGCCAAGTG AGAGCTGGGC AGGGAAGGGA AGTGTGCGGG GCAGGGCGGC 

9901 ACACTCCAGG AATCCAGTAG GGCCCGGCCC CTGGCCCAGC CCCTGGACAC 

9951 ACCTGCATTC TGCAGGCTGA GGTCCATTTG CTCTGGGAGC ACTGGGCCAC 

10001 TGGGAGAGGG GAGGGGGCGG CTCAGCTCCT CACCCCAGCC CAGCCCAGCC 

10051 CAGCCCAGCC CATTGTCTCT TCTTCAAGGG AACTGAAATA CCGGGCCGGG 

10101 CATCAGGCGG ATGAGCCCAG CTTCCAGATC CGAGACTATG TAGAGTCCCA 

10151 GAAGAAGCGC TCCAGCTGCT GCTCCTTCAT GTGAATCCCA GGGGGCAGAG 

10201 AGGAGGCTCT GGAGGCACAC AGGATGCAGC CTTCCCCCTC CCAGGCCTGG 

10251 CTTATTCCAA GAGGCTGAGC CAATGGGGAG AAAGATGGAG GACTCACTGC 

10301 ACAGCCGCTT CCTAGCAGGG AGCTATACTC CAACTCCTAC TTGAGTTCCT 

10351 GCGGTCTCCC CGCATCCACA GGGAGGGTAA AACACTTAGC TTTTATTTTA 

10401 ATAGTACATA ATTTAATACC AAAAAAGGCG CCTGGATCCC CAAAAAACCG 

10451 AGGCTGGGAG CTAGTGGCCC TTTTGCTTTC TAGGACTTGG GGGGCCGGCC 

10501 CTCCCTCCTA AGCATAACAA AGGTGGTGTT GCTCCAGCTC AGCCCCAGGG 

10551 GACACAGATG CACTTTGGGG GTGAGGGCAG GTAATGACTC CATCGCACCC 

10601 TCAGTTCAGC TGGACAGAGG CTCAGGTGAC CCCAGCCTTC ACTGTCTCCC 

10651 GCTCTCCAGG AGCTTATCTT CGCCCCATCT CCCAAATAAG TGGGCCCTTG 

10701 TGCTGTGAGG AAGACCAAAG CCTCAGGGAA GATAAGAGAT ATGGAGATGG 

10751 GAGGGGGAGG ACAAGGGGCA GAGAGTAGGG TCTAGCTGGC TATCTCTGGC 

10801 CTTACTAACA CCCCCCTGGA GGCATGCCCC TTTTCTCCAG CACACAAGCA 

10851 CATTGGGGCA CCTGGAAATA TTGGTTCCAG GCTCCTGTTC TCTGGACTTC 

10901 AGATCCTGGG GGAGCCCCTC CCCCCCCTGA ATCCCTGGCT TAGCTACCTT 

10951 CCTGCCTGTG CACCTAAAAA CCTCAGGTCA GAACTAGGAA AAGAGTTTTG 

11001 TTTTTATTTT TTTGAAATGG AGTCTCGTTC TGTCGCCCAG GCTGAGGTGC 

11051 AGTAGTGCAA TCTCCGCTCA CTACAACCTC CACTCCCTGG GGCTCAAGCG 

11101 ATCCTCCCAC CTCAGCCGCC GAAGTAGCTG GGACTATAGG TGTGTACCAT 

11151 CACACCTGGC TAATTTTTGT ATTTTTTGTA GACACAGGGT TTCGCCATGT 

11201 TGCCCAGGCT GGTCTTGAAT TCCTGAGCTC AAGCAACCTG CCGGCCTCGG 

11251 CCTCCCAAAG TACTGGGATT ACACGCAGAA GGCACCATGC CCAGGCTAGA 

11301 TGTGTCTTAT CCCAATCCTT TGGCAGGCAT GCAGCTCCAC AGGCGATTTC 

11351 TTCAAGCAGC TGAAGTGTTT AGCCCTCCTG GGTTAAGAGC CAGATAAGGA 

11401 GAAATCCCTT TCCTAGGTTT GGAATGTGTT GTGAAAAAAA AGAGAAATCC 

11451 CTGGCTCCTG GAGCTGGTGG GAGACAAGAT TAAGCAAACC TCCCCTGACA 

11501 TGTATCCCTT TGACCCCAAG CTCTGCCTCC TCCCTGACCA CCCATGCCCT 

11551 TTCCTTTAAC TTCTCAAACA GATACCAGGG CCTAAACTGC TTTACCTCCC 

11601 CTCCTACTGA GTCAGGTTAG GTGGTGGGAG GTCACCCATT TCCGAGTTAA 

11651 ACCAATGCAA TATGAGTAAA ACAAAGTCAT GTGGGTATGT CTGGGGTAGA 

11701 GAGAGGGGTA GCAAGTTCAT GTGTCCTCCT TGGTCACATA TCTCCCAAAG 

11751 CTCTGATCCC TGCCATGGGA AGTGGACAGG AAACATGAGG TCATGACCTG 

11801 CAGGCATCTT TACTGCAGCT CTGCCGGCCT GGAGGGGGAG AGGGGGAGGA 

11851 AGAAGTATGC GCTGCACATT TCTGAGGCTA CTGCATTTGC TTTCAAGGCA 

11901 GAAATCTTGC TCTGAGCAGT CAGCGGCTCC AGTTTGGGCC CGATAAGGAA 

11951 GTTCTCCGTG GCCTCCCTCA GGCAGAGCAG GGAGGAGGCT GACATTGCCA 

12001 GTCTCTTCTG GGGCCCAAGG CAGGTTGCAG GAGATCCAAT CCCATAGACA 

12051 GCTCTGGGCC TCTTGCATTT GAGTTTTTCA GAATTAAACT GCAGTATTTT 

12101 GGAAAGCACA TCCTGTCCAC TGTTTCTTTG AAGTGAGTGG GGGGGGGGGG 

12151 TCTTGTTGAA GGAATTGTCA TTCACTGCCA AAATCATTCC ATCCTCCTTC 

12201 CTCAGTGTCT GTCCTCAGAT GGTCAGCTCC CCGCTCAACA GACTGTCTCC 

12251 CGCCTCTGTG ACCAGCCTCT CTTTGGCAAG AGGGAGCTAG AAGGCTTTAC 

12301 AGTCCTAATC ATTTTTCTGT TGGAAAAAAA AAAAAAAAAC CAAGGCTCCT 

12351 TTCCCTGTGG CGTGTACCCA GAGGTTGATT ACCTGAGTCT GTCCTGCCTC 

12401 TCCCCACCCC ACCTCCCTAG CCAAACGCTG CTGCCAAAGC CCACGCTATT 

12451 GCCCTAGATG GCCTGTCTTC AGCGGGCTGC CCCTCGAGGT CCCAGGCTCT 

12501 CCGCGGAGCC CTCACCTTCC CAGCAGGGAT CAGAACCTGC ACTCCTCTAT 

12551 GCGAGTCCTG GGACAGCACA AAGTGGATTA GGGTTAGGGT TCCCACAAAC 
12601 GGAAAAATGT TATTCAAACA ACTCTGTAGG GTCCGAGGAG GCCCTCCGTC 

12651 TTAATTCTCG AGACTGACCG GCCCTCGCTG CCCCGAGCGG GAGCAGTTGC 

12701 CCCGGCAACA GCCGCTCCCT CTCAACTGGA GCTGCACCCA GGCTTTGGCT 

12751 AAAGGCTGTT AAAACGTTGG CCAGGTGCGG AGGCTCACGT CTGTAATCCC 

12801 AGGGCGGATC ACCTGAGGTC AGGAGTTTGA AACCATCCTG GCCAACATGG 

12851 CGAAATTTCG TCTCTACTAA AAATACAAAA ATTAGCGGGG CGTGGTGGTG 

12901 CGCGCCTGTA ACCCCAGCTG CTCGGGAGGC TGAGGCAGGG GAATCGCTTG 

12951 AACCCGGGAG GCGGAGGTTG CAGTGATCCG AGATCGCGCC ACGGCAGTCC 

13001 AGCCTGGGCG ACAGAGCGAG ACTCCGTCTC AAAAAAAAAA AAAAAAGTTA 

13051 GGGTCCTTTA CCCGAGGGCC GGCTTTCCTC ACTCCCCGCC ACAGGTAGGG 

13101 GAAACCAGGC CGGAGCCGGC GGGCCCACCC GCCCAGAACC GGGAATTCGG 

13151 CGAGCCCCGC CCCTGCCACC CCAGCGCCGG CC (SEQ ID NO: 3) 
ACCCATTAAGCCACCTAACCAGCAGCTGGGAAATTCCAGCATTGGATCTAGACCCCTGTT 
ATCCAAGATTGGAGAACAGTGGGACAAAGTGCTCCTCTCCACCATTCCTGCGTGTCCCTG 
GGGAAGATGAGCAGAGCAGAGCCAGACAGTAAAGGAGAGGGCCACGCCCCCTCCACAGGT 
TACCTCCTTGGTACTCCTGCCCGCACTACCCACAGCAACCCCGGGATGCCGATCTGCAGC 
CACATGTCCCATGTGGGAGGTTTCTGCTGAAAGAACTTCCAACTACACATCTCCCCACTT 
[C,T] 

AGTATAAATTTCAACCTTCCCTAATTCATGCAACCTTTTTTTTTTTTTTTTTTTTTTGAG 
ACAGAGTGTCGCTCTGTCACCGAGGCTGGAGTTCAGTGATGCAATCTCGGCTCACTGCAA 
CCTCTACCTCCTGGGTTCAAGCTATTCTCCTGTCTCCGCCTCCCAAGTAACTGGGACTAC 
AGGCGTGTGCCACCACTCCTGGCTAGTTTTTTGTATTTTTAGTAGAGATGGGGTTTCACC 
TTGTTGGTCAGGCTGGTCTCAAACTCCCAACTCAGGTGATCCGTCCACTTGGGCACCCAA 

GATTGGAGAACAGTGGGACAAAGTGCTCCTCTCCACCATTCCTGCGTGTCCCTGGGGAAG 
ATGAGCAGAGCAGAGCCAGACAGTAAAGGAGAGGGCCACGCCCCCTCCACAGGTTACCTC 
CTTGGTACTCCTGCCCGCACTACCCACAGCAACCCCGGGATGCCGATCTGCAGCCACATG 
TCCCATGTGGGAGGTTTCTGCTGAAAGAACTTCCAACTACACATCTCCCCACTTCAGTAT 
AAATTTCAACCTTCCCTAATTCATGCAACCTTTTTTTTTTTTTTTTTTTTTTGAGACAGA 



TGTCGCTCTGTCACCGAGGCTGGAGTTCAGTGATGCAATCTCGGCTCACTGCAACCTCTA 
CCTCCTGGGTTCAAGCTATTCTCCTGTCTCCGCCTCCCAAGTAACTGGGACTACAGGCGT 
GTGCCACCACTCCTGGCTAGTTTTTTGTATTTTTAGTAGAGATGGGGTTTCACCTTGTTG 
GTCAGGCTGGTCTCAAACTCCCAACTCAGGTGATCCGTCCACTTGGGCACCCAAAATG 

TGCTCCTCTCCACCATTCCTGCGTGTCCCTGGGGAAGATGAGCAGAGCAGAGCCAGACAG 
TAAAGGAGAGGGCCACGCCCCCTCCACAGGTTACCTCCTTGGTACTCCTGCCCGCACTAC 
CCACAGCAACCCCGGGATGCCGATCTGCAGCCACATGTCCCATGTGGGAGGTTTCTGCTG 
AAAGAACTTCCAACTACACATCTCCCCACTTCAGTATAAATTTCAACCTTCCCTAATTCA 
TGCAACCTTTTTTTTTTTTTTTTTTTTTTGAGACAGAGTGTCGCTCTGTCACCGAGGCTG 
[G,A] 

AGTTCAGTGATGCAATCTCGGCTCACTGCAACCTCTACCTCCTGGGTTCAAGCTATTCTC 
CTGTCTCCGCCTCCCAAGTAACTGGGACTACAGGCGTGTGCCACCACTCCTGGCTAGTTT 
TTTGTATTTTTAGTAGAGATGGGGTTTCACCTTGTTGGTCAGGCTGGTCTCAAACTCCCA 
ACTCAGGTGATCCGTCCACTTGGGCACCCAAAATG 

TTCAAGTACCAGCCTGGCCAACATGGTAGAAACCCCGTCTCTACTAAAAATAAAAAATTA 
GCCAGGCGAGGTGGTGCATGCCTATAATCCCAGCTACTCAGGTAGGCTGAGGCAGGAGAA 
TCATTTAAACCTGGGAGGTGGAGGTTGTGGTGAGCCAAGATCTCGCCATTGCACTCCAGC 
CTGGGCAACAAGAGCAAAACTCC 



TCTCAAAAAAAAAAAGAAAGAAAGAAAGAAAGAAACTTCCAAATAAATGTTGTGACACAA 
AAAAAAAAACCCAAACAATATTCATTATAGAGTATGCAAATGACCATGCCCCACCCCCAG 
CAGATTCTGATAGACTCCCTTGGGTGGGAATCCTTGTCCAATATATTGACACTTCCCTTT 
CCTGTCAGTATAGCCCAGCCCATGCGTGTACTCACGAGCGGACGATGGATGACACAAGTA 
CACAGAGGGACGGAATCCCTGCATGGTGTGGCTATGGGCAAATGTGGCCACTGTCTAGAT 

TTCAAGTACCAGCCTGGCCAACATGGTAGAAACCCCGTCTCTACTAAAAATAAAAAATTA 
GCCAGGCGAGGTGGTGCATGCCTATAATCCCAGCTACTCAGGTAGGCTGAGGCAGGAGAA 
TCATTTAAACCTGGGAGGTGGAGGTTGTGGTGAGCCAAGATCTCGCCATTGCACTCCAGC 
CTGGGCAACAAGAGCAAAACTCCGTCTCAAAAAAAAAAAGAAAGAAAGAAAGAAAGAAAC 
TTCCAAATAAATGTTGTGACAC 



AAAAAAAAAACCCAAACAATATTCATTATAGAGTATGCAAATGACCATGCCCCACCCCCA 
GCAGATTCTGATAGACTCCCTTGGGTGGGAATCCTTGTCCAATATATTGACACTTCCCTT 
TCCTGTCAGTATAGCCCAGCCCATGCGTGTACTCACGAGCGGACGATGGATGACACAAGT 
ACACAGAGGGACGGAATCCCTGCATGGTGTGGCTATGGGCAAATGTGGCCACTGTCTAGA 
TTGTGCAAATGTGGTGGTTCTCTGGGGCCACAGAGCACACTTGGGGACCTGTTCATGGTG 

CACCAGGGACCCTCCCTCTCACCTTGACGACTCCATCTTACAAATCTGCATCAGGGATGC 
TAGACGCTGCACACCTGAAGTGTTCAATAGAGAAAAGGTCTCACCCTGGCAGGTGGGGCT 



[G,A] 



[-,A] 



FIGURE 




1 



f 



CTACAGCTTCAAGCAGGCAGAAAGCGAACACTTCCTTCACTAGAGAATTAGTGGGCAGCT 
AAAGAAAAGGTGCTGCTGCAGATGTAGCCTCAGGTCCCCAGGATGCAGGCAAACACCCCA 
TCTCCAGGGGCTCGGTCACAGTCCCAAGGCTAGGCTCCAGGAGAGGGAGACCGAAGTGGG 
[A,G] 

AAAGGGCAGGGCCTCCAGCAGCAACCAGCCCTCCAGCCCTGGGCTGCCTGATCCCTGGAG 
AGAGCCAGGATGTTTCTCAGGCTCCTCTTGCCCTGCTGTTGTGAGAAGGCAGTTACAGTC 
CTCAGAAGGGACGACTCCACAGTGGAGGTGTCTGGGTATGGGGTTCCTGCTGCCCTGATG 
GTATGATCTGGCTGGAGACGGTTCTGGGGCTCACTGCACCCACTCTAGGCCTGGAGAGGG 
AACAAGAGAGGACGTCTGCAGAGCTGAGGAGCCACATGACTCCTGCCCTCCCATCCTCTG 

8624 GTGCCACCTGCATAGCCCTCACTGTGATTCACGAGTGTGTTTCGTGACAAAGTGTTCAGA 
ACAGCCCCCACTCCACCCTGGATAATTATCCACAGAGACCAAGGGAAAAACACAACCAGA 
AAAGTCCACACATACATCCAGGGCAAGTTGCAAGAAAGTGACTCAGTCAGACAGAGTGAG 
TGGTTGTATCCTCACAACCAAACTATTATAGAGACAAAAATTTGATAAATTCAAGCACCA 
ATTTTGTTCACGACATTGTATAGGTTTCATGAATCCCCTGACCTCAAGGACAGTTTGCTG 
[A,G] 

TAAGCAAACTAGGAGAATAAAACGTTTATATAGAAAGAGGAAAATCCATGGCACTCATAC 
TCCTACCTCCAACCCCATGCTCATGGCAGACATCACTAATCAATCACAGTACTTTTGATC 
ACTGAAACCCTTATGTGGTCTTAGAATCTTTAACAGGACACTCCAAGAAATCACTGCTGA 
CAGCCAACTGATTTGTGAGATAAGGTCTCCATGCATCTGGATCTTCCATAGAACTGATAG 
TTGCACAGCATAAAATGGTGAGGGTGGGGCCATTGTGGGTTGAGCCACCAAGGAAGGCCA 
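
The number printed in front of a context block appears to give the position of the bracketed variant in the genomic sequence (SEQ ID NO: 3). The following is a minimal sketch of how that relationship can be checked, assuming 1-based coordinates; the function name locate_variant is illustrative, and the fragment and flank strings are short excerpts copied from the 8601 row of the genomic listing and the end of the 5' context of the 8624 block.

```python
def locate_variant(fragment: str, fragment_start: int, flank5: str):
    """Return (position, reference_base) of the base that follows flank5.

    fragment_start is the 1-based coordinate of fragment[0].
    """
    idx = fragment.find(flank5)
    if idx < 0:
        raise ValueError("5' flank not found in fragment")
    if fragment.find(flank5, idx + 1) >= 0:
        raise ValueError("5' flank occurs more than once; use a longer flank")
    offset = idx + len(flank5)
    if offset >= len(fragment):
        raise ValueError("fragment ends before the variant base")
    return fragment_start + offset, fragment[offset]


# Genomic positions 8601-8650 as listed in SEQ ID NO: 3.
GENOMIC_8601 = "CTGACCTCAAGGACAGTTTGCTGATAAGCAAACTAGGAGAATAAAACGTT"
# Last 13 bases of the 5' context that precedes the [A,G] variant.
FLANK5 = "GGACAGTTTGCTG"

position, ref_base = locate_variant(GENOMIC_8601, 8601, FLANK5)
print(position, ref_base)
```

For this excerpt the sketch returns position 8624 with reference base A, matching the block label and the first listed allele. The same check could be applied to the context blocks that carry no printed position, given a long enough flank to be unique.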

8661 TGTTTCGTGACAAAGTGTTCAGAACAGCCCCCACTCCACCCTGGATAATTATCCACAGAG 
ACCAAGGGAAAAACACAACCAGAAAAGTCCACACATACATCCAGGGCAAGTTGCAAGAAA 
GTGACTCAGTCAGACAGAGTGAGTGGTTGTATCCTCACAACCAAACTATTATAGAGACAA 
AAATTTGATAAATTCAAGCACCAATTTTGTTCACGACATTGTATAGGTTTCATGAATCCC 
CTGACCTCAAGGACAGTTTGCTGATAAGCAAACTAGGAGAATAAAACGTTTATATAGAAA 
[G,A] 

AGGAAAATCCATGGCACTCATACTCCTACCTCCAACCCCATGCTCATGGCAGACATCACT 
AATCAATCACAGTACTTTTGATCACTGAAACCCTTATGTGGTCTTAGAATCTTTAACAGG 
ACACTCCAAGAAATCACTGCTGACAGCCAACTGATTTGTGAGATAAGGTCTCCATGCATC 
TGGATCTTCCATAGAACTGATAGTTGCACAGCATAAAATGGTGAGGGTGGGGCCATTGTG 
GGTTGAGCCACCAAGGAAGGCCATCCAGGCCTGGATGGGCCAGAACAAAGGTACAGATGA 

11754 GCTCCTGGAGCTGGTGGGAGACAAGATTAAGCAAACCTCCCCTGACATGTATCCCTTTGA 
CCCCAAGCTCTGCCTCCTCCCTGACCACCCATGCCCTTTCCTTTAACTTCTCAAACAGAT 
ACCAGGGCCTAAACTGCTTTACCTCCCCTCCTACTGAGTCAGGTTAGGTGGTGGGAGGTC 
ACCCATTTCCGAGTTAAACCAATGCAATATGAGTAAAACAAAGTCATGTGGGTATGTCTG 
GGGTAGAGAGAGGGGTAGCAAGTTCATGTGTCCTCCTTGGTCACATATCTCCCAAAGCTC 
[T,C] 

GATCCCTGCCATGGGAAGTGGACAGGAAACATGAGGTCATGACCTGCAGGCATCTTTACT 
GCAGCTCTGCCGGCCTGGAGGGGGAGAGGGGGAGGAAGAAGTATGCGCTGCACATTTCTG 
AGGCTACTGCATTTGCTTTCAAGGCAGAAATCTTGCTCTGAGCAGTCAGCGGCTCCAGTT 
TGGGCCCGATAAGGAAGTTCTCCGTGGCCTCCCTCAGGCAGAGCAGGGAGGAGGCTGACA 
TTGCCAGTCTCTTCTGGGGCCCAAGGCAGGTTGCAGGAGATCCAATCCCATAGACAGCTC 

11836 GACCACCCATGCCCTTTCCTTTAACTTCTCAAACAGATACCAGGGCCTAAACTGCTTTAC 
CTCCCCTCCTACTGAGTCAGGTTAGGTGGTGGGAGGTCACCCATTTCCGAGTTAAACCAA 
TGCAATATGAGTAAAACAAAGTCATGTGGGTATGTCTGGGGTAGAGAGAGGGGTAGCAAG 
TTCATGTGTCCTCCTTGGTCACATATCTCCCAAAGCTCTGATCCCTGCCATGGGAAGTGG 
ACAGGAAACATGAGGTCATGACCTGCAGGCATCTTTACTGCAGCTCTGCCGGCCTGGAGG 
[A,G] 

GGAGAGGGGGAGGAAGAAGTATGCGCTGCACATTTCTGAGGCTACTGCATTTGCTTTCAA 
GGCAGAAATCTTGCTCTGAGCAGTCAGCGGCTCCAGTTTGGGCCCGATAAGGAAGTTCTC 
CGTGGCCTCCCTCAGGCAGAGCAGGGAGGAGGCTGACATTGCCAGTCTCTTCTGGGGCCC 
AAGGCAGGTTGCAGGAGATCCAATCCCATAGACAGCTCTGGGCCTCTTGCATTTGAGTTT 
TTCAGAATTAAACTGCAGTATTTTGGAAAGCACATCCTGTCCACTGTTTCTTTGAAGTGA 